Polarisation mode dispersion compensator using a mean square error technique

ABSTRACT

A method for the adaptive adjustment of a PMD compensator in optical fiber communication systems comprises the steps of taking the signal at the compensator output and extracting the components y₁(t) and y₂(t) on the two orthogonal polarizations, computing the signal y(t)=|y₁(t)|²+|y₂(t)|², sampling the signal y(t) at instants t_(k)=kT with T=symbol interval to obtain samples y(t_(k)), computing the mean square error e(k)=y(t_(k))−u(k) with u(k) equal to the symbol transmitted, and adjusting the parameters of the compensator to seek to minimize e(k). A PMD compensator in accordance with the method comprises an adjustment system which comprises in turn a photodetector (17) which takes the components y₁(t) and y₂(t) on the two orthogonal polarizations from the signal at the compensator output, a sampler (19) which samples at instants t_(k)=kT with T=symbol interval, the signal y(t)=|y₁(t)|²+|y₂(t)|² at the output of the photodetector (17) to obtain samples y(t_(k)), a circuit (18, 20) for computation of the mean square error e(k)=y(t_(k))−u(k) with u(k) equal to the symbol transmitted, and a regulator (15, 16) which regulates the parameters of the compensator to seek to minimize e(k).

The present invention relates to methods of adaptive adjustment of PMD compensators in optical fiber communication systems. The present invention also relates to a compensator in accordance with said method.

In optical fiber telecommunications equipment, the need to compensate for the effects of polarization mode dispersion (PMD), which occur when an optical signal travels over an optical fiber connection, is well known.

It is known that PMD distorts and disperses optical signals sent over optical fiber connections. The different time delays among the various signal components in the various polarization states acquire increasing importance with the increase in transmission speeds. In modern optical fiber based transmission systems with ever higher bit rates (10 Gbit/s and more), accurate compensation of PMD effects becomes very important and delicate. This compensation must be dynamic and performed at adequate speed.

The general purpose of the present invention is to remedy the above mentioned shortcomings by making available a method of fast, accurate adaptive adjustment of a PMD compensator and a compensator in accordance with said method.

In view of this purpose it was sought to provide in accordance with the present invention a method for the adaptive adjustment of a PMD compensator in optical fiber communication systems, with the compensator comprising a cascade of adjustable optical devices over which passes an optical signal to be compensated, comprising the steps of extracting the components y₁(t) and y₂(t) on the two orthogonal polarizations of the signal at the compensator output, obtaining the signal y(t)=|y₁(t)|²+|y₂(t)|², sampling the signal y(t) at instants t_(k)=kT with T=symbol interval to obtain samples y(t_(k)), computing the mean square error e(k)=y(t_(k))−u(k) with u(k) equal to the symbol transmitted or with u(k) replaced by a decision û(k) on the transmitted symbol u(k), and producing control signals for parameters of at least some of said adjustable optical devices to tend toward minimization of e(k).

In accordance with the method it was also sought to realize a PMD compensator in optical fiber communication systems applying the method and comprising a cascade of adjustable optical devices over which passes an optical signal to be compensated and an adjustment system comprising a photodetector 17 which takes the components y₁(t) and y₂(t) on the two orthogonal polarizations from the signal at the compensator output, a sampler 19 which samples at instants t_(k)=kT with T=symbol interval, the signal y(t)=|y₁(t)|²+|y₂(t)|² at the output of the photodetector 17 to obtain samples y(t_(k)), a circuit 18, 20 for computation of the mean square error e(k)=y(t_(k))−u(k) with u(k) equal to the symbol transmitted, and a regulator 15, 16 which regulates parameters of at least some of said optical devices to tend towards minimization of e(k).

To clarify the explanation of the innovative principles of the present invention and its advantages compared with the prior art, a possible embodiment applying said principles is described below with the aid of the annexed drawings, by way of non-limiting example. In the drawings:

FIG. 1 shows a block diagram of a PMD compensator with associated control circuit, and

FIG. 2 shows an equivalent model of the PMD compensator.

With reference to the figures, FIG. 1 shows the structure of a PMD compensator designated as a whole by reference number 10. This structure consists of a cascade of optical devices which receive the signal from the transmission fiber 11. The first optical device is a polarization controller 12 (PC) which allows modification of the optical signal polarization at its input. Then there are three polarization maintaining fibers 13 (PMF) separated by two optical rotators 14.

A PMF fiber is a fiber which introduces a predetermined differential group delay (DGD) between the components of the optical signal on its two principal states of polarization (PSP) termed slow PSP and fast PSP.

In the case of the compensator shown in FIG. 1 the DGD delays at the frequency of the optical carrier introduced by the three PMFs are respectively τ_(c), ατ_(c) and (1−α)τ_(c), with 0<α<1, where τ_(c) and α are design parameters.

An optical rotator is a device which can change the polarization of the optical signal at its input by an angle θ_(i) (the figure shows θ₁ for the first rotator and θ₂ for the second) on a maximum circle on the Poincaré sphere.

An optical rotator is implemented in practice by means of a properly controlled PC.

In FIG. 1, x₁(t) and x₂(t) designate the components on the two PSPs of the optical signal at the compensator input whereas similarly y₁(t) and y₂(t) are the components of the optical signal at the compensator output.

The components y₁(t) and y₂(t) on the two orthogonal polarizations are sent to the input of the photodetector (PD) which produces a signal y(t) given by: $\begin{matrix} {y(t) = |y_{1}(t)|^{2} + |y_{2}(t)|^{2}} & (1) \end{matrix}$

This signal can if necessary be filtered by a post-detection filter. Without loss of generality let us assume that this filter is not present. If present this filter produces obvious changes in the adaptive adjustment of the compensator parameters. These changes are readily imaginable to those skilled in the art and are therefore not further described.

As shown again in FIG. 1 the signal y(t) is then sampled by a sampler 19 at the instants t_(k)=kT, where T is the symbol interval. Based on the sample y(t_(k)) a decision on the transmitted bit u(k) is made. Let us designate here by û(k) this decision made by a known decider circuit 18 (DEC).
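The detection, sampling and decision chain just described can be sketched as follows. This is a minimal illustration, not the circuit of FIG. 1: the function names are hypothetical, the output components are assumed to be already available as complex samples taken at t_(k)=kT, and the decider is assumed to be a simple threshold at 0.5 for on-off keyed symbols u(k) in {0, 1}.

```python
def detect(y1, y2):
    """Photodetector output y = |y1|^2 + |y2|^2 for one sampling instant."""
    return abs(y1) ** 2 + abs(y2) ** 2


def decide(sample, threshold=0.5):
    """Hypothetical threshold decider (DEC): returns the decided bit, 0 or 1."""
    return 1 if sample >= threshold else 0


def detect_and_decide(y1_samples, y2_samples, threshold=0.5):
    """Return the samples y(t_k) and the decisions on u(k) for two sample lists."""
    samples = [detect(a, b) for a, b in zip(y1_samples, y2_samples)]
    decisions = [decide(s, threshold) for s in samples]
    return samples, decisions
```

The threshold value is an assumption of this sketch; the description only requires a known decider circuit.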

The input-output behavior of each of the optical devices is described here by means of the so-called Jones transfer matrix H(ω), which is a 2×2 matrix characterized by frequency-dependent components. Denoting by W₁(ω) and W₂(ω) the Fourier transforms of the components of the optical signal at the device input, the Fourier transforms Z₁(ω) and Z₂(ω) of the components of the optical signal at the device output are given by: $\begin{matrix} {\begin{pmatrix} {Z_{1}(\omega)} \\ {Z_{2}(\omega)} \end{pmatrix} = {{H(\omega)}\begin{pmatrix} {W_{1}(\omega)} \\ {W_{2}(\omega)} \end{pmatrix}}} & (2) \end{matrix}$

Thus the Jones transfer matrix of the PC is $\begin{matrix} \begin{pmatrix} h_{1} & h_{2} \\ {- h_{2}^{*}} & h_{1}^{*} \end{pmatrix} & (3) \end{matrix}$ where h₁ and h₂ satisfy the condition |h₁|²+|h₂|²=1 and are frequency independent.

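Denoting by φ₁ and φ₂ the PC control angles, h₁ and h₂ are expressed by: $\begin{matrix} {\begin{matrix} {h_{1} = -\cos(\phi_{2} - \phi_{1}) + j\sin(\phi_{2} - \phi_{1})\sin\phi_{1}} \\ {h_{2} = -j\sin(\phi_{2} - \phi_{1})\cos\phi_{1}} \end{matrix}} & (4) \end{matrix}$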

Clearly if the PC is controlled using other angles or voltages, different relationships will correlate these other parameters with h₁ and h₂. The straightforward changes in the algorithms for adaptive adjustment of the PMD compensator are discussed below. Similarly, an optical rotator with rotation angle θ_(i) is characterized by the following Jones matrix: $\begin{matrix} \begin{pmatrix} {\cos\theta_{i}} & {\sin\theta_{i}} \\ {- \sin\theta_{i}} & {\cos\theta_{i}} \end{pmatrix} & (5) \end{matrix}$

The Jones transfer matrix of a PMF with DGD τ_(i) may be expressed as RDR⁻¹ where D is defined as: $\begin{matrix} {D\hat{=}\begin{pmatrix} e^{j\omega\tau_{i}/2} & 0 \\ 0 & e^{- j\omega\tau_{i}/2} \end{pmatrix}} & (6) \end{matrix}$ and R is a unitary rotation matrix accounting for the PSPs' orientation. This matrix R may be taken as the identity matrix I without loss of generality when the PSPs of all the PMFs are aligned.
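As an illustration of how the matrices (3) to (6) combine, the sketch below builds the Jones matrix of the whole cascade of FIG. 1 at one angular frequency. It is a sketch under stated assumptions: R is taken as the identity (aligned PSPs), the devices are applied in the order PC, PMF, rotator, PMF, rotator, PMF, and all function names are hypothetical.

```python
import cmath
import math


def matmul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def pc_matrix(phi1, phi2):
    """Polarization controller, eq. (3) with h1, h2 from eq. (4)."""
    d = phi2 - phi1
    h1 = -math.cos(d) + 1j * math.sin(d) * math.sin(phi1)
    h2 = -1j * math.sin(d) * math.cos(phi1)
    return [[h1, h2], [-h2.conjugate(), h1.conjugate()]]


def rotator_matrix(theta):
    """Optical rotator, eq. (5)."""
    return [[math.cos(theta), math.sin(theta)],
            [-math.sin(theta), math.cos(theta)]]


def pmf_matrix(omega, tau):
    """PMF with DGD tau, eq. (6), assuming R = I."""
    return [[cmath.exp(1j * omega * tau / 2), 0],
            [0, cmath.exp(-1j * omega * tau / 2)]]


def compensator_matrix(omega, phi1, phi2, theta1, theta2, tau_c, alpha):
    """Jones matrix of the cascade PC, PMF, rotator, PMF, rotator, PMF."""
    stages = [pc_matrix(phi1, phi2),
              pmf_matrix(omega, tau_c),
              rotator_matrix(theta1),
              pmf_matrix(omega, alpha * tau_c),
              rotator_matrix(theta2),
              pmf_matrix(omega, (1 - alpha) * tau_c)]
    total = [[1, 0], [0, 1]]
    for stage in stages:
        total = matmul(stage, total)  # later devices multiply from the left
    return total


def is_unitary(m, tol=1e-12):
    """Check that the conjugate transpose of m times m is the identity."""
    for i in range(2):
        for j in range(2):
            s = sum(m[k][i].conjugate() * m[k][j] for k in range(2))
            if abs(s - (1 if i == j else 0)) > tol:
                return False
    return True
```

Each factor is unitary, so the cascade is unitary as well: the compensator redistributes the signal between the two polarizations without changing its total power.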

As shown in FIG. 1, to control the PMD compensator a controller 15 is needed to produce optical device control signals of the compensator calculated on the basis of the quantities sent to it by a controller pilot 16 termed controller driver (CD).

The CD feeds the controller with the quantities needed to update the compensator optical device control parameters. As described below, these quantities will be extracted by the CD from the signals at the input and/or output of the compensator.

The controller will operate following the criterion described below and will use one of the two algorithms described below.

To illustrate the PMD compensator adaptive adjustment algorithms let us assume that the controller can directly control the parameters φ₁, φ₂, θ₁ and θ₂, which we consolidate in a vector θ defined as: $\theta\hat{=}(\phi_{1},\phi_{2},\theta_{1},\theta_{2})^{T}$

If it is not so, in general there will be other parameters to control, for example some voltages, which will be linked to the previous ones in known relationships. The simple changes necessary are discussed below.

Since PMD is a slowly changing phenomenon, adjustment of the compensator parameters will be performed at a rate lower than the transmitted symbol rate 1/T. Let us assume that this adjustment is performed at the discrete-time instants t_(nL)=nLT where L≥1. We designate by: θ(nL)=(φ₁(nL), φ₂(nL), θ₁(nL), θ₂(nL))^(T) the value of the compensator parameters after the nth update.

In accordance with the method of the present invention the compensator parameters are adjusted according to the mean square error (MSE) criterion.

Based thereon, the compensator parameters θ are adjusted to minimize the mean square value of the error e(k) defined as: $\begin{matrix} {e(k) = y(t_{k}) - u(k)} & (7) \end{matrix}$

This error is a function of θ through y(t_(k)). We make this dependence explicit by defining F(θ)=e(k). Therefore, the performance index to be minimized is the mean value of F²(θ). The compensator parameters θ will be updated by the rule: $\begin{matrix} {\begin{aligned} \phi_{1}[(n+1)L] &= \phi_{1}(nL) - \gamma\left.\frac{\partial E\{F^{2}(\theta)\}}{\partial\phi_{1}}\right|_{\theta=\theta(nL)} \\ &= \phi_{1}(nL) - 2\gamma F(\theta)\left.\frac{\partial E\{F(\theta)\}}{\partial\phi_{1}}\right|_{\theta=\theta(nL)} \\ &= \phi_{1}(nL) - 2\gamma\left[y(t_{nL}) - u(nL)\right]\frac{\partial E\{y(t_{nL})\}}{\partial\phi_{1}} \\ \phi_{2}[(n+1)L] &= \phi_{2}(nL) - \gamma\left.\frac{\partial E\{F^{2}(\theta)\}}{\partial\phi_{2}}\right|_{\theta=\theta(nL)} \\ &= \phi_{2}(nL) - 2\gamma F(\theta)\left.\frac{\partial E\{F(\theta)\}}{\partial\phi_{2}}\right|_{\theta=\theta(nL)} \\ &= \phi_{2}(nL) - 2\gamma\left[y(t_{nL}) - u(nL)\right]\frac{\partial E\{y(t_{nL})\}}{\partial\phi_{2}} \\ \theta_{1}[(n+1)L] &= \theta_{1}(nL) - \gamma\left.\frac{\partial E\{F^{2}(\theta)\}}{\partial\theta_{1}}\right|_{\theta=\theta(nL)} \\ &= \theta_{1}(nL) - 2\gamma F(\theta)\left.\frac{\partial E\{F(\theta)\}}{\partial\theta_{1}}\right|_{\theta=\theta(nL)} \\ &= \theta_{1}(nL) - 2\gamma\left[y(t_{nL}) - u(nL)\right]\frac{\partial E\{y(t_{nL})\}}{\partial\theta_{1}} \\ \theta_{2}[(n+1)L] &= \theta_{2}(nL) - \gamma\left.\frac{\partial E\{F^{2}(\theta)\}}{\partial\theta_{2}}\right|_{\theta=\theta(nL)} \\ &= \theta_{2}(nL) - 2\gamma F(\theta)\left.\frac{\partial E\{F(\theta)\}}{\partial\theta_{2}}\right|_{\theta=\theta(nL)} \\ &= \theta_{2}(nL) - 2\gamma\left[y(t_{nL}) - u(nL)\right]\frac{\partial E\{y(t_{nL})\}}{\partial\theta_{2}} \end{aligned}} & (8) \end{matrix}$ where E{·} denotes expectation and γ>0 is a scale factor which controls the adjustment amount.

In vector notation this means that the vector of the compensator parameters is updated by adding a new vector with its norm proportional to the norm of the gradient of F²(θ) and with opposite direction, i.e. all its components have their sign changed: $\begin{matrix} {\theta[(n+1)L] = \theta(nL) - \gamma\left.\nabla E\{F^{2}(\theta)\}\right|_{\theta=\theta(nL)} = \theta(nL) - 2\gamma F(\theta)\left.\nabla E\{F(\theta)\}\right|_{\theta=\theta(nL)}} & (9) \end{matrix}$

This way we are sure to move towards a relative minimum of the function F²(θ). Three variations of the basic updating method defined in (8) can be obtained by using only the sign information contained in the error e(k) and/or the partial derivative. Hence the three possible variations are (considering for example the updating rule related to φ₁): $\begin{matrix} {\phi_{1}[(n+1)L] = \phi_{1}(nL) - 2\gamma\,\mathrm{sign}\left[y(t_{nL}) - u(nL)\right]\frac{\partial E\{y(t_{nL})\}}{\partial\phi_{1}}} & (10) \end{matrix}$ or $\begin{matrix} {\phi_{1}[(n+1)L] = \phi_{1}(nL) - 2\gamma\left[y(t_{nL}) - u(nL)\right]\mathrm{sign}\,\frac{\partial E\{y(t_{nL})\}}{\partial\phi_{1}}} & (11) \end{matrix}$ or $\begin{matrix} {\phi_{1}[(n+1)L] = \phi_{1}(nL) - 2\gamma\,\mathrm{sign}\left[y(t_{nL}) - u(nL)\right]\mathrm{sign}\,\frac{\partial E\{y(t_{nL})\}}{\partial\phi_{1}}} & (12) \end{matrix}$
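The basic rule (8) and its three sign-based variations differ only in which factors are replaced by their sign, which the following sketch makes explicit for a single parameter. The function and the variant labels are hypothetical names chosen for illustration; the error and the partial derivative are assumed to be supplied by whatever estimation method is in use.

```python
def sign(x):
    """Return -1, 0 or 1 according to the sign of x."""
    return (x > 0) - (x < 0)


def update_parameter(value, gamma, error, derivative, variant="full"):
    """One update of a single control parameter.

    value      -- current parameter, e.g. phi_1(nL)
    gamma      -- positive scale factor
    error      -- y(t_nL) - u(nL)
    derivative -- estimate of dE{y(t_nL)}/d(parameter)
    variant    -- "full" for rule (8), "sign_error" for (10),
                  "sign_derivative" for (11), "sign_sign" for (12)
    """
    if variant == "full":
        step = error * derivative
    elif variant == "sign_error":
        step = sign(error) * derivative
    elif variant == "sign_derivative":
        step = error * sign(derivative)
    elif variant == "sign_sign":
        step = sign(error) * sign(derivative)
    else:
        raise ValueError("unknown variant: " + variant)
    return value - 2 * gamma * step
```

The sign-based variants trade accuracy of the step size for simpler hardware, since a sign can be obtained with a comparator instead of a multiplier.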

We shall now describe two methods of computing the gradient of the function F²(θ) to find the required control parameters.

First Method

Let us consider the updating rule in vector notation (9). To simplify this rule, in the error F(θ)=e(k) we substitute the transmitted information symbol u(nL) with the corresponding decision û(nL), i.e. we substitute the error e(k) with the estimated error ê(k) defined as: $\begin{matrix} {\hat{e}(k) = y(t_{k}) - \hat{u}(k)} & (13) \end{matrix}$ In the diagram of FIG. 1 this estimated error is obtained at the output of the subtractor block 20 and sent to the CD.

Defining G[θ(nL)]=E{ê²(nL)}, the updating rule (9) becomes: $\begin{matrix} {\theta[(n+1)L] = \theta(nL) - \gamma\left.\nabla G(\theta)\right|_{\theta=\theta(nL)}} & (14) \end{matrix}$

The partial derivatives of G(θ) for θ=θ(nL) can be computed using the following 5-step procedure.

Step 1: Find the value of G[θ(nL)]=G[φ₁(nL), φ₂(nL), θ₁(nL), θ₂(nL)] at iteration n. To do this, in the time interval (nLT, nLT+LT/5) an estimate of G[θ(nL)] is computed by averaging the L/5 values of the estimated square error, i.e., $\begin{matrix} {G[\theta(nL)] = \frac{\sum\limits_{i = 0}^{L/5 - 1}\hat{e}^{2}(nL + i)}{L/5}} & (15) \end{matrix}$

Step 2: Find the partial derivative: $\left.\frac{\partial G(\theta)}{\partial\phi_{1}}\right|_{\theta=\theta(nL)}$ at iteration n. To do this, parameter φ₁ is set at φ₁(nL)+Δ while the other parameters are left unchanged. The corresponding value of G(θ), i.e. G[φ₁(nL)+Δ, φ₂(nL), θ₁(nL), θ₂(nL)], is computed as in Step 1 but in the time interval (nLT+LT/5, nLT+2LT/5). The estimate of the partial derivative of G(θ) with respect to φ₁ is computed as: $\begin{matrix} {\left.\frac{\partial G(\theta)}{\partial\phi_{1}}\right|_{\theta=\theta(nL)} \cong \frac{G[\phi_{1}(nL) + \Delta,\phi_{2}(nL),\theta_{1}(nL),\theta_{2}(nL)] - G[\phi_{1}(nL),\phi_{2}(nL),\theta_{1}(nL),\theta_{2}(nL)]}{\Delta}} & (16) \end{matrix}$

Step 3: Find the partial derivative: $\left.\frac{\partial G(\theta)}{\partial\phi_{2}}\right|_{\theta=\theta(nL)}$ at iteration n. To do this, parameter φ₂ is set at φ₂(nL)+Δ while the other parameters are left unchanged. The corresponding value of G(θ), i.e. G[φ₁(nL), φ₂(nL)+Δ, θ₁(nL), θ₂(nL)], is computed as in Step 1 but in the time interval (nLT+2LT/5, nLT+3LT/5). The estimate of the partial derivative of G(θ) with respect to φ₂ is computed as: $\begin{matrix} {\left.\frac{\partial G(\theta)}{\partial\phi_{2}}\right|_{\theta=\theta(nL)} \cong \frac{G[\phi_{1}(nL),\phi_{2}(nL) + \Delta,\theta_{1}(nL),\theta_{2}(nL)] - G[\phi_{1}(nL),\phi_{2}(nL),\theta_{1}(nL),\theta_{2}(nL)]}{\Delta}} & (17) \end{matrix}$

Step 4: Find the partial derivative: $\left.\frac{\partial G(\theta)}{\partial\theta_{1}}\right|_{\theta=\theta(nL)}$ at iteration n. To do this, parameter θ₁ is set at θ₁(nL)+Δ while the other parameters are left unchanged. The corresponding value of G(θ), i.e. G[φ₁(nL), φ₂(nL), θ₁(nL)+Δ, θ₂(nL)], is computed as in Step 1 but in the time interval (nLT+3LT/5, nLT+4LT/5). The estimate of the partial derivative of G(θ) with respect to θ₁ is computed as: $\begin{matrix} {\left.\frac{\partial G(\theta)}{\partial\theta_{1}}\right|_{\theta=\theta(nL)} \cong \frac{G[\phi_{1}(nL),\phi_{2}(nL),\theta_{1}(nL) + \Delta,\theta_{2}(nL)] - G[\phi_{1}(nL),\phi_{2}(nL),\theta_{1}(nL),\theta_{2}(nL)]}{\Delta}} & (18) \end{matrix}$

Step 5: Find the partial derivative: $\left.\frac{\partial G(\theta)}{\partial\theta_{2}}\right|_{\theta=\theta(nL)}$ at iteration n. To do this, parameter θ₂ is set at θ₂(nL)+Δ while the other parameters are left unchanged. The corresponding value of G(θ), i.e. G[φ₁(nL), φ₂(nL), θ₁(nL), θ₂(nL)+Δ], is computed as in Step 1 but in the time interval (nLT+4LT/5, (n+1)LT). The estimate of the partial derivative of G(θ) with respect to θ₂ is computed as: $\begin{matrix} {\left.\frac{\partial G(\theta)}{\partial\theta_{2}}\right|_{\theta=\theta(nL)} \cong \frac{G[\phi_{1}(nL),\phi_{2}(nL),\theta_{1}(nL),\theta_{2}(nL) + \Delta] - G[\phi_{1}(nL),\phi_{2}(nL),\theta_{1}(nL),\theta_{2}(nL)]}{\Delta}} & (19) \end{matrix}$

The above parameter update is done only after estimation of the gradient has been completed.
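The five-step procedure amounts to a forward-difference estimate of the gradient followed by one update according to (14). A minimal sketch is given below; `measure_g` is a hypothetical callback standing for "apply these parameters to the compensator and average the squared estimated error over L/5 symbols", and is an assumption of this example rather than part of the description.

```python
def mean_square(errors):
    """Average of the squared estimated errors, as in eq. (15)."""
    return sum(e * e for e in errors) / len(errors)


def estimate_gradient(measure_g, theta, delta):
    """Steps 1 to 5: one baseline measurement plus one probe per parameter."""
    baseline = measure_g(list(theta))                  # Step 1
    gradient = []
    for i in range(len(theta)):                        # Steps 2 to 5
        probe = list(theta)
        probe[i] += delta                              # perturb one parameter only
        gradient.append((measure_g(probe) - baseline) / delta)
    return gradient


def first_method_update(measure_g, theta, delta, gamma):
    """Update all parameters at once, only after the gradient is complete."""
    gradient = estimate_gradient(measure_g, theta, delta)
    return [t - gamma * g for t, g in zip(theta, gradient)]
```

With four parameters each iteration needs five measurements, which is why the update interval of L symbols is split into five sub-intervals of L/5 symbols.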

Note that in this case it is not necessary that the relationship between the control parameters of PC and optical rotators and the corresponding Jones matrices be known. Indeed, the partial derivatives of the function with respect to the compensator control parameters are computed without knowledge of this relationship. Consequently if the control parameters are different from those assumed as an example and are for example some voltage or some other angle, we may similarly compute the partial derivative and update these different control parameters accordingly.

Lastly, it is noted that when this algorithm is used the CD is not necessary and the controller must receive the estimated error only.

Second Method

When an accurate characterization of the PC and each optical rotator is available the updating rules may be expressed as a function of the estimated error and the signals on the two orthogonal polarizations at the compensator input.

Using the known stochastic gradient algorithm (for example as mentioned in the book Digital Communications by J. G. Proakis, McGraw-Hill, New York, 1983) and substituting in (8) the error e(nL) with the corresponding estimated error ê(nL) we find: $\begin{matrix} {\begin{aligned} \phi_{1}[(n+1)L] &= \phi_{1}(nL) - 2\gamma\hat{e}(nL)\frac{\partial y(t_{nL})}{\partial\phi_{1}} \\ \phi_{2}[(n+1)L] &= \phi_{2}(nL) - 2\gamma\hat{e}(nL)\frac{\partial y(t_{nL})}{\partial\phi_{2}} \\ \theta_{1}[(n+1)L] &= \theta_{1}(nL) - 2\gamma\hat{e}(nL)\frac{\partial y(t_{nL})}{\partial\theta_{1}} \\ \theta_{2}[(n+1)L] &= \theta_{2}(nL) - 2\gamma\hat{e}(nL)\frac{\partial y(t_{nL})}{\partial\theta_{2}} \end{aligned}} & (20) \end{matrix}$ where, as is usual for the stochastic gradient algorithm, the derivative of the expected value E{y(t_(nL))} appearing in (8) is replaced by the derivative of the instantaneous value y(t_(nL)).

In vector notation the expression (9) becomes: $\begin{matrix} {\theta[(n+1)L] = \theta(nL) - 2\gamma\hat{e}(nL)\nabla y(t_{nL})} & (21) \end{matrix}$

Before describing how the gradient of y(t_(nL)) is to be computed we introduce an equivalent model of the PMD compensator.

It was found that the PMD compensator shown in FIG. 1 can be modeled as equivalent to a two-dimensional transversal filter using four tapped delay lines (TDL) combining the signals on the two principal states of polarization (PSP). This equivalent model is shown in FIG. 2 where: $\begin{matrix} {\begin{matrix} {c_{1}\hat{=}\cos\theta_{1}\cos\theta_{2}\,h_{1}} \\ {c_{2}\hat{=} - \sin\theta_{1}\sin\theta_{2}\,h_{1}} \\ {c_{3}\hat{=} - \sin\theta_{1}\cos\theta_{2}\,h_{2}^{*}} \\ {c_{4}\hat{=} - \cos\theta_{1}\sin\theta_{2}\,h_{2}^{*}} \\ {c_{5}\hat{=}\cos\theta_{1}\cos\theta_{2}\,h_{2}} \\ {c_{6}\hat{=} - \sin\theta_{1}\sin\theta_{2}\,h_{2}} \\ {c_{7}\hat{=}\sin\theta_{1}\cos\theta_{2}\,h_{1}^{*}} \\ {c_{8}\hat{=}\cos\theta_{1}\sin\theta_{2}\,h_{1}^{*}} \end{matrix}} & (22) \end{matrix}$

For the sake of convenience let c(θ) designate the vector whose components are the c_(i) in (22). It is noted that the tap coefficients c_(i) of the four TDLs are not independent of each other. On the contrary, given four of them the others are completely determined by (22). In FIG. 2, for the sake of clarity, β designates 1−α.

The partial derivatives of y(t_(nL)) appearing in (21) may be expressed as a function of the components on the two PSPs of the signal at the compensator input at some appropriate instants. The output sample y(t_(k)) may be written as (where (B)^(H) indicates the transposed conjugate of the matrix B): $\begin{matrix} {y(t_{k}) = c^{H}A(k)c} & (23) \end{matrix}$ where the Hermitian matrix A(k) is given by: $\begin{matrix} {A(k) = a^{*}(k)a^{T}(k) + b^{*}(k)b^{T}(k)} & (24) \end{matrix}$ with vectors a(k) and b(k) defined by: $\begin{matrix} {a(k) = \begin{pmatrix} {x_{1}(t_{k})} \\ {x_{1}(t_{k} - \alpha\tau_{c})} \\ {x_{1}(t_{k} - \tau_{c})} \\ {x_{1}(t_{k} - \tau_{c} - \alpha\tau_{c})} \\ {x_{2}(t_{k})} \\ {x_{2}(t_{k} - \alpha\tau_{c})} \\ {x_{2}(t_{k} - \tau_{c})} \\ {x_{2}(t_{k} - \tau_{c} - \alpha\tau_{c})} \end{pmatrix}\quad b(k) = \begin{pmatrix} {x_{2}^{*}(t_{k} - 2\tau_{c})} \\ {x_{2}^{*}(t_{k} - \tau_{c} - \beta\tau_{c})} \\ {x_{2}^{*}(t_{k} - \tau_{c})} \\ {x_{2}^{*}(t_{k} - \beta\tau_{c})} \\ {- x_{1}^{*}(t_{k} - 2\tau_{c})} \\ {- x_{1}^{*}(t_{k} - \tau_{c} - \beta\tau_{c})} \\ {- x_{1}^{*}(t_{k} - \tau_{c})} \\ {- x_{1}^{*}(t_{k} - \beta\tau_{c})} \end{pmatrix}} & (25) \end{matrix}$

Computing the gradient of y(t_(nL)), the algorithm (9) becomes: $\begin{matrix} {\theta[(n+1)L] = \theta(nL) - 4\gamma\hat{e}(nL)\,\mathrm{Re}\{J^{H}A(nL)c\}} & (26) \end{matrix}$ where $\begin{matrix} {J\hat{=}\begin{pmatrix} \frac{\partial c_{1}}{\partial\phi_{1}} & \frac{\partial c_{1}}{\partial\phi_{2}} & \frac{\partial c_{1}}{\partial\theta_{1}} & \frac{\partial c_{1}}{\partial\theta_{2}} \\ \frac{\partial c_{2}}{\partial\phi_{1}} & \frac{\partial c_{2}}{\partial\phi_{2}} & \frac{\partial c_{2}}{\partial\theta_{1}} & \frac{\partial c_{2}}{\partial\theta_{2}} \\ \vdots & \vdots & \vdots & \vdots \\ \frac{\partial c_{8}}{\partial\phi_{1}} & \frac{\partial c_{8}}{\partial\phi_{2}} & \frac{\partial c_{8}}{\partial\theta_{1}} & \frac{\partial c_{8}}{\partial\theta_{2}} \end{pmatrix}} & (27) \end{matrix}$ is the Jacobian matrix of the transformation c=c(θ).
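The quantities in (22), (23) and (26) can be illustrated with the sketch below. It is not the implementation of the controller: the vectors a and b of (25) are assumed to be supplied as lists of eight complex samples, the Jacobian J is approximated by central differences instead of being derived analytically (an assumption made here for brevity), and all function names are hypothetical.

```python
import math


def taps(theta):
    """Tap vector c(theta) of eq. (22), with h1 and h2 from eq. (4)."""
    phi1, phi2, th1, th2 = theta
    d = phi2 - phi1
    h1 = -math.cos(d) + 1j * math.sin(d) * math.sin(phi1)
    h2 = -1j * math.sin(d) * math.cos(phi1)
    c1, s1 = math.cos(th1), math.sin(th1)
    c2, s2 = math.cos(th2), math.sin(th2)
    h1c, h2c = h1.conjugate(), h2.conjugate()
    return [c1 * c2 * h1, -s1 * s2 * h1, -s1 * c2 * h2c, -c1 * s2 * h2c,
            c1 * c2 * h2, -s1 * s2 * h2, s1 * c2 * h1c, c1 * s2 * h1c]


def dot(u, v):
    """Plain (non-conjugating) product u^T v."""
    return sum(x * y for x, y in zip(u, v))


def output_sample(c, a, b):
    """y = c^H A c with A = a* a^T + b* b^T, i.e. |a^T c|^2 + |b^T c|^2."""
    return abs(dot(a, c)) ** 2 + abs(dot(b, c)) ** 2


def jacobian_columns(theta, step=1e-6):
    """Columns dc/dtheta_j of J, approximated by central differences."""
    columns = []
    for j in range(len(theta)):
        up, down = list(theta), list(theta)
        up[j] += step
        down[j] -= step
        columns.append([(p - m) / (2 * step)
                        for p, m in zip(taps(up), taps(down))])
    return columns


def gradient_of_output(theta, a, b):
    """Gradient of y with respect to theta, equal to 2 Re{J^H A c}."""
    c = taps(theta)
    ac, bc = dot(a, c), dot(b, c)
    return [2 * (dot(a, col).conjugate() * ac
                 + dot(b, col).conjugate() * bc).real
            for col in jacobian_columns(theta)]


def second_method_update(theta, gamma, estimated_error, a, b):
    """One step of rule (26): theta - 4*gamma*e_hat*Re{J^H A c}."""
    grad = gradient_of_output(theta, a, b)   # already includes the factor 2
    return [t - 2 * gamma * estimated_error * g for t, g in zip(theta, grad)]
```

Because A(k) is a sum of two rank-one terms, the quadratic form reduces to two inner products, so the 8×8 matrix never has to be formed explicitly.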

When the control parameters are different from those taken as examples we will have different relationships between these control parameters and the coefficients c_(i). For example, if the PC is controlled by means of some voltages, then given the relationship between these voltages and the coefficients h₁ and h₂ which appear in (3), by using the equations (22) we will always be able to express the coefficients c_(i) as a function of these new control parameters.

Consequently, in computing the gradient of y(t_(nL)) the only change we have to allow for is the expression of the Jacobian matrix J, which has to be changed accordingly as readily imaginable to those skilled in the art.

Lastly it is noted that when this second method is used the CD must receive the optical signals at the input of the compensator and the estimated error ê(nL). The CD must supply the controller directly with this estimated error and with the signal samples x₁(t) and x₂(t) at the desired instants.

It is now clear that the predetermined purposes have been achieved by making available an effective method for adaptive control of a PMD compensator and a compensator applying this method.

Naturally the above description of an embodiment applying the innovative principles of the present invention is given by way of non-limiting example of said principles within the scope of the exclusive right claimed here. 

1. Method for the adaptive adjustment of a PMD compensator in optical fiber communication systems with the compensator comprising a cascade of adjustable optical devices over which passes an optical signal to be compensated comprising the steps of: a. extracting the components y₁(t) and y₂(t) on the two orthogonal polarizations of the signal at the compensator output, b. obtaining the signal y(t)=|y₁(t)|²+|y₂(t)|², c. sampling the signal y(t) at instants t_(k)=kT with T=symbol interval to obtain samples y(t_(k)), d. computing the mean square error e(k)=y(t_(k))−u(k) with u(k) equal to the symbol transmitted or with u(k) replaced by a decision û(k) on the transmitted symbol u(k), and e. producing control signals for parameters of at least some of said adjustable optical devices to tend toward minimization of e(k).
 2. Method in accordance with claim 1 in which said parameters being consolidated in a vector θ and a function F(θ)=e(k) being defined the parameters are adjusted to tend to minimize the mean value of F²(θ).
 3. Method in accordance with claim 2 in which the vector θ being defined as: $\theta\hat{=}(\phi_{1},\phi_{2},\theta_{1},\theta_{2})^{T}$ where φ₁, φ₂, θ₁ and θ₂ are said parameters these parameters are updated by the rule: $\begin{matrix} {\begin{aligned} \phi_{1}[(n+1)L] &= \phi_{1}(nL) - \gamma\left.\frac{\partial E\{F^{2}(\theta)\}}{\partial\phi_{1}}\right|_{\theta=\theta(nL)} \\ &= \phi_{1}(nL) - 2\gamma F(\theta)\left.\frac{\partial E\{F(\theta)\}}{\partial\phi_{1}}\right|_{\theta=\theta(nL)} \\ &= \phi_{1}(nL) - 2\gamma\left[y(t_{nL}) - u(nL)\right]\frac{\partial E\{y(t_{nL})\}}{\partial\phi_{1}} \\ \phi_{2}[(n+1)L] &= \phi_{2}(nL) - \gamma\left.\frac{\partial E\{F^{2}(\theta)\}}{\partial\phi_{2}}\right|_{\theta=\theta(nL)} \\ &= \phi_{2}(nL) - 2\gamma F(\theta)\left.\frac{\partial E\{F(\theta)\}}{\partial\phi_{2}}\right|_{\theta=\theta(nL)} \\ &= \phi_{2}(nL) - 2\gamma\left[y(t_{nL}) - u(nL)\right]\frac{\partial E\{y(t_{nL})\}}{\partial\phi_{2}} \\ \theta_{1}[(n+1)L] &= \theta_{1}(nL) - \gamma\left.\frac{\partial E\{F^{2}(\theta)\}}{\partial\theta_{1}}\right|_{\theta=\theta(nL)} \\ &= \theta_{1}(nL) - 2\gamma F(\theta)\left.\frac{\partial E\{F(\theta)\}}{\partial\theta_{1}}\right|_{\theta=\theta(nL)} \\ &= \theta_{1}(nL) - 2\gamma\left[y(t_{nL}) - u(nL)\right]\frac{\partial E\{y(t_{nL})\}}{\partial\theta_{1}} \\ \theta_{2}[(n+1)L] &= \theta_{2}(nL) - \gamma\left.\frac{\partial E\{F^{2}(\theta)\}}{\partial\theta_{2}}\right|_{\theta=\theta(nL)} \\ &= \theta_{2}(nL) - 2\gamma F(\theta)\left.\frac{\partial E\{F(\theta)\}}{\partial\theta_{2}}\right|_{\theta=\theta(nL)} \\ &= \theta_{2}(nL) - 2\gamma\left[y(t_{nL}) - u(nL)\right]\frac{\partial E\{y(t_{nL})\}}{\partial\theta_{2}} \end{aligned}} & (8) \end{matrix}$ where E{·} denotes expectation and γ>0 is a scale factor which controls the adjustment amount.
 4. Method in accordance with claim 2 in which the vector θ of the parameters is updated by adding a new vector with the norm proportionate to the norm of the gradient of F²(θ) and with opposite direction, i.e. all its components have their sign changed so that the updating rule is: $\begin{matrix} {\theta[(n+1)L] = \theta(nL) - \gamma\left.\nabla E\{F^{2}(\theta)\}\right|_{\theta=\theta(nL)} = \theta(nL) - 2\gamma F(\theta)\left.\nabla E\{F(\theta)\}\right|_{\theta=\theta(nL)}} & (9) \end{matrix}$ so that movement is towards a relative minimum of the function F²(θ).
 5. Method in accordance with claim 2 in which in (8) only the sign of the error e(k)=y(t_(k))−u(k) and/or of the partial derivative is used.
 6. Method in accordance with claim 4 in which in the error F(θ)=e(k) the transmitted information symbol u(nL) is substituted with the corresponding decision û(nL) so as to substitute the error e(k) with the estimated error ê(k) defined as ê(k)=y(t_(k))−û(k).
 7. Method in accordance with claim 6 in which G[θ(nL)]=E{ê²(nL)} is defined so that the updating rule (9) becomes: $\begin{matrix} {\theta[(n+1)L] = \theta(nL) - \gamma\left.\nabla G(\theta)\right|_{\theta=\theta(nL)}} & (14) \end{matrix}$
 8. Method in accordance with claim 7 in which the partial derivatives of G(θ) for θ=θ(t_(n)) are computed by the following five-step procedure:
 Step 1. Find the value of G[θ(nL)]=G[φ₁(nL), φ₂(nL), θ₁(nL), θ₂(nL)] at iteration n; to do this, in the time interval (nLT, nLT+LT/5) an estimate of G[θ(nL)] is computed by averaging the L/5 values of the estimated square error, i.e., $\begin{matrix} {G\left\lbrack \theta(nL) \right\rbrack = \frac{\sum\limits_{i = 0}^{L/5 - 1}{\hat{e}}^{2}(nL + i)}{L/5}} & (15) \end{matrix}$
 Step 2. Find the partial derivative $\left. \frac{\partial G(\theta)}{\partial\phi_{1}} \right|_{\theta = \theta(nL)}$ at iteration n; to do this, the parameter φ₁ is set at φ₁(nL)+Δ while the other parameters are left unchanged; the corresponding value of G(θ), i.e. G[φ₁(nL)+Δ, φ₂(nL), θ₁(nL), θ₂(nL)], is computed as in Step 1 but in the time interval (nLT+LT/5, nLT+2LT/5); the estimate of the partial derivative of G(θ) with respect to φ₁ is computed as: $\begin{matrix} {\left. \frac{\partial G(\theta)}{\partial\phi_{1}} \right|_{\theta = \theta(nL)} \cong \frac{G\left\lbrack \phi_{1}(nL) + \Delta,\phi_{2}(nL),\theta_{1}(nL),\theta_{2}(nL) \right\rbrack - G\left\lbrack \phi_{1}(nL),\phi_{2}(nL),\theta_{1}(nL),\theta_{2}(nL) \right\rbrack}{\Delta}} & (16) \end{matrix}$
 Step 3. Find the partial derivative $\left. \frac{\partial G(\theta)}{\partial\phi_{2}} \right|_{\theta = \theta(nL)}$ at iteration n; to do this, the parameter φ₂ is set at φ₂(nL)+Δ while the other parameters are left unchanged; the corresponding value of G(θ), i.e. G[φ₁(nL), φ₂(nL)+Δ, θ₁(nL), θ₂(nL)], is computed as in Step 1 but in the time interval (nLT+2LT/5, nLT+3LT/5); the estimate of the partial derivative of G(θ) with respect to φ₂ is computed as: $\begin{matrix} {\left. \frac{\partial G(\theta)}{\partial\phi_{2}} \right|_{\theta = \theta(nL)} \cong \frac{G\left\lbrack \phi_{1}(nL),\phi_{2}(nL) + \Delta,\theta_{1}(nL),\theta_{2}(nL) \right\rbrack - G\left\lbrack \phi_{1}(nL),\phi_{2}(nL),\theta_{1}(nL),\theta_{2}(nL) \right\rbrack}{\Delta}} & (17) \end{matrix}$
 Step 4. Find the partial derivative $\left. \frac{\partial G(\theta)}{\partial\theta_{1}} \right|_{\theta = \theta(nL)}$ at iteration n; to do this, the parameter θ₁ is set at θ₁(nL)+Δ while the other parameters are left unchanged; the corresponding value of G(θ), i.e. G[φ₁(nL), φ₂(nL), θ₁(nL)+Δ, θ₂(nL)], is computed as in Step 1 but in the time interval (nLT+3LT/5, nLT+4LT/5); the estimate of the partial derivative of G(θ) with respect to θ₁ is computed as: $\begin{matrix} {\left. \frac{\partial G(\theta)}{\partial\theta_{1}} \right|_{\theta = \theta(nL)} \cong \frac{G\left\lbrack \phi_{1}(nL),\phi_{2}(nL),\theta_{1}(nL) + \Delta,\theta_{2}(nL) \right\rbrack - G\left\lbrack \phi_{1}(nL),\phi_{2}(nL),\theta_{1}(nL),\theta_{2}(nL) \right\rbrack}{\Delta}} & (18) \end{matrix}$
 Step 5. Find the partial derivative $\left. \frac{\partial G(\theta)}{\partial\theta_{2}} \right|_{\theta = \theta(nL)}$ at iteration n; to do this, the parameter θ₂ is set at θ₂(nL)+Δ while the other parameters are left unchanged; the corresponding value of G(θ), i.e. G[φ₁(nL), φ₂(nL), θ₁(nL), θ₂(nL)+Δ], is computed as in Step 1 but in the time interval (nLT+4LT/5, (n+1)LT); the estimate of the partial derivative of G(θ) with respect to θ₂ is computed as: $\begin{matrix} {\left. \frac{\partial G(\theta)}{\partial\theta_{2}} \right|_{\theta = \theta(nL)} \cong \frac{G\left\lbrack \phi_{1}(nL),\phi_{2}(nL),\theta_{1}(nL),\theta_{2}(nL) + \Delta \right\rbrack - G\left\lbrack \phi_{1}(nL),\phi_{2}(nL),\theta_{1}(nL),\theta_{2}(nL) \right\rbrack}{\Delta}} & (19) \end{matrix}$
 9. Method in accordance with claim 3 in which in (8) the error e(nL) is substituted with the estimated error ê(nL) and the updating rules become: $\begin{matrix} {\begin{matrix} {\phi_{1}\left\lbrack (n + 1)L \right\rbrack = \phi_{1}(nL) - 2\gamma\hat{e}(nL)\frac{\partial E\left\{ y\left( t_{nL} \right) \right\}}{\partial\phi_{1}}} \\ {\phi_{2}\left\lbrack (n + 1)L \right\rbrack = \phi_{2}(nL) - 2\gamma\hat{e}(nL)\frac{\partial E\left\{ y\left( t_{nL} \right) \right\}}{\partial\phi_{2}}} \\ {\theta_{1}\left\lbrack (n + 1)L \right\rbrack = \theta_{1}(nL) - 2\gamma\hat{e}(nL)\frac{\partial E\left\{ y\left( t_{nL} \right) \right\}}{\partial\theta_{1}}} \\ {\theta_{2}\left\lbrack (n + 1)L \right\rbrack = \theta_{2}(nL) - 2\gamma\hat{e}(nL)\frac{\partial E\left\{ y\left( t_{nL} \right) \right\}}{\partial\theta_{2}}} \end{matrix}} & (20) \end{matrix}$
 10. Method in accordance with claim 4 in which the error e(nL) is substituted with the estimated error ê(nL) so that the updating rule (9) becomes: θ[(n+1)L]=θ(nL)−2γê(nL)∇y(t_(nL))   (21)
 11. Method in accordance with claim 1 in which the PMD compensator is modeled as a two-dimensional transverse filter using four tapped delay lines (TDL) combining the signals on the two principal polarization states (PSP).
 12. Method in accordance with claims 2 and 11 in which the vector θ of said parameters is updated by adding a new vector with the norm proportionate to the norm of the gradient of F²(θ) and with opposite direction, i.e. all its components have their sign changed, so that the updating rule is: θ[(n+1)L]=θ(nL)−4γŷ(nL)Re{J^(H)A(nL)c}   (26) with the Hermitian matrix A(k) given by: A(k)=a(k)*a^(T)(k)+b(k)*b^(T)(k)   (24) with the vectors a(k) and b(k) given by: $\begin{matrix} {a(k) = \begin{pmatrix} {x_{1}\left( t_{k} \right)} \\ {x_{1}\left( t_{k} - \alpha\tau_{c} \right)} \\ {x_{1}\left( t_{k} - \tau_{c} \right)} \\ {x_{1}\left( t_{k} - \tau_{c} - \alpha\tau_{c} \right)} \\ {x_{2}\left( t_{k} \right)} \\ {x_{2}\left( t_{k} - \alpha\tau_{c} \right)} \\ {x_{2}\left( t_{k} - \tau_{c} \right)} \\ {x_{2}\left( t_{k} - \tau_{c} - \alpha\tau_{c} \right)} \end{pmatrix}} & {b(k) = \begin{pmatrix} {x_{2}^{*}\left( t_{k} - 2\tau_{c} \right)} \\ {x_{2}^{*}\left( t_{k} - \tau_{c} - \beta\tau_{c} \right)} \\ {x_{2}^{*}\left( t_{k} - \tau_{c} \right)} \\ {x_{2}^{*}\left( t_{k} - \beta\tau_{c} \right)} \\ {- x_{1}^{*}\left( t_{k} - 2\tau_{c} \right)} \\ {- x_{1}^{*}\left( t_{k} - \tau_{c} - \beta\tau_{c} \right)} \\ {- x_{1}^{*}\left( t_{k} - \tau_{c} \right)} \\ {- x_{1}^{*}\left( t_{k} - \beta\tau_{c} \right)} \end{pmatrix}} & (25) \end{matrix}$ and $\begin{matrix} {J \triangleq \begin{pmatrix} \frac{\partial c_{1}}{\partial\phi_{1}} & \frac{\partial c_{1}}{\partial\phi_{2}} & \frac{\partial c_{1}}{\partial\theta_{1}} & \frac{\partial c_{1}}{\partial\theta_{2}} \\ \frac{\partial c_{2}}{\partial\phi_{1}} & \frac{\partial c_{2}}{\partial\phi_{2}} & \frac{\partial c_{2}}{\partial\theta_{1}} & \frac{\partial c_{2}}{\partial\theta_{2}} \\ \vdots & \vdots & \vdots & \vdots \\ \frac{\partial c_{8}}{\partial\phi_{1}} & \frac{\partial c_{8}}{\partial\phi_{2}} & \frac{\partial c_{8}}{\partial\theta_{1}} & \frac{\partial c_{8}}{\partial\theta_{2}} \end{pmatrix}} & (27) \end{matrix}$ where c₁, . . . , c₈ are the tap coefficients of the four tapped delay lines and x₁(t), x₂(t) are the components of the two principal polarization states at the compensator input.
 13. Method in accordance with claim 1 in which said optical devices comprise a polarization controller with control angles φ₁, φ₂ and two optical rotators with rotation angles θ₁ and θ₂ and said parameters comprise said control angles φ₁, φ₂ and said rotation angles θ₁, θ₂ or functions thereof.
 14. Method in accordance with claim 13 in which between the controller and an optical rotator and between optical rotators there are fibers which introduce a predetermined differential unit delay maintaining the polarization.
 15. Compensator for PMD in optical fiber communication systems applying the method in accordance with any one of the above claims and comprising a cascade of adjustable optical devices over which passes an optical signal to be compensated and an adjustment system comprising a photodetector (17) which takes the components y₁(t) and y₂(t) on the two orthogonal polarizations from the signal at the compensator output, a sampler (19) which samples at instants t_(k)=kT with T=symbol interval, the signal y(t)=|y₁(t)|²+|y₂(t)|² at the output of the photodetector (17) to obtain samples y(t_(k)), a circuit (18, 20) for computation of the mean square error e(k)=y(t_(k))−u(k) with u(k) equal to the symbol transmitted and a regulator (15, 16) which regulates parameters of at least some of said optical devices to tend towards minimization of e(k).
 16. Compensator in accordance with claim 15 characterized in that said optical devices comprise a polarization controller with control angles φ₁, φ₂ and two optical rotators with rotation angles θ₁ and θ₂ and in which said parameters which are adjusted consist of said control angles φ₁, φ₂ and said rotation angles θ₁ and θ₂.
 17. Compensator in accordance with claim 16 characterized in that between the controller and optical rotator and between optical rotators there are fibers which introduce a predetermined differential unit delay maintaining the polarization. 
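
By way of illustration only, the five-step procedure of claim 8 combined with the updating rule (14) of claim 7 may be sketched numerically as follows. The sketch assumes a hypothetical stand-in cost `toy_cost` (a simple quadratic with an arbitrary minimum `TARGET`) in place of the measured mean square error G(θ). The function names, the step size `gamma` and the increment `delta` are illustrative assumptions and are not taken from the claims.

```python
import numpy as np

def estimate_cost(sq_errors):
    """Average a block of squared estimated errors, in the spirit of (15)."""
    return float(np.mean(sq_errors))

def finite_difference_gradient(measure_cost, theta, delta):
    """Forward-difference estimate of the gradient of G, as in (16)-(19).

    measure_cost(theta) stands in for one measurement sub-interval of
    length LT/5 during which the parameters are held at theta.
    """
    theta = np.asarray(theta, dtype=float)
    g0 = measure_cost(theta)               # Step 1: unperturbed cost
    grad = np.zeros_like(theta)
    for i in range(theta.size):            # Steps 2-5: perturb one parameter at a time
        perturbed = theta.copy()
        perturbed[i] += delta
        grad[i] = (measure_cost(perturbed) - g0) / delta
    return grad

def update(theta, grad, gamma):
    """One gradient step, theta <- theta - gamma * grad G, as in rule (14)."""
    return np.asarray(theta, dtype=float) - gamma * np.asarray(grad, dtype=float)

# Hypothetical cost surface: NOT the compensator's real G(theta).
# Parameter order assumed here: (phi_1, phi_2, theta_1, theta_2).
TARGET = np.array([0.3, -0.2, 0.5, 0.1])

def toy_cost(theta):
    return float(np.sum((np.asarray(theta, dtype=float) - TARGET) ** 2))

theta = np.zeros(4)
for _ in range(200):
    grad = finite_difference_gradient(toy_cost, theta, delta=1e-3)
    theta = update(theta, grad, gamma=0.1)
```

In this sketch each call to `measure_cost` corresponds to one of the five sub-intervals of an iteration, so one parameter update consumes five cost measurements. The forward difference leaves a small bias of the order of Δ in the estimated gradient, which is why a small Δ is assumed here.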